Dynamic Power Management

* High-level view only, not an in-depth tutorial.

* What part of the problem space is addressed, relationships to other
  parts, where it sits relative to other Linux PM technology.


Introducing DPM

* An embedded alternative to cpufreq and ACPI.

* Developed for platforms with numerous power tunables and designed for
  frequent low-latency power state transitions according to low-level
  system state changes.

* Abstracts complex hardware PM features in ways useful to application
  and driver developers.  H/W designers may be unsure which PM
  capabilities S/W developers will want to use.  S/W developers may be
  unsure how to map their application PM needs to the features of a new
  platform.

* Also runs on IBM PowerPC 405LP, Renesas SH Mobile 3/V, Freescale
  MX21ADS, Intel Centrino... (some of these on 2.4 kernels only at this
  point).

* Designed and developed in conjunction with researchers with a very
  strong background in low power hardware and software design.

* DPM has been demonstrated to help save power while satisfying
  performance needs.  How well this works depends largely on the
  platform.

* Lives on SourceForge, not on kernel.org.  Community involvement in the
  project is not substantial (but there is strong commercial interest).
  Have discussed DPM and other embedded-oriented PM features with the
  cpufreq maintainers in the past; will re-approach them soon to gauge
  interest in applying DPM features to existing open source technology.


What DPM Does


Clock and Voltage Scaling

* May change frequencies very often (depending on policy), and changes
  occur asynchronously; this takes some getting used to.

* Expects a system designer or higher-level software to set up
  appropriate power policies based on product needs.


How Clock and Voltage Scaling is Configured

* Designer chooses operating point based on points validated/supported
  by hardware vendor, and based on the power parameters available in DPM
  for the board.

* Designer chooses policy based on power/performance goals.  Power state
  transition latencies may need to be taken into account.

* Generally needs customization for a particular product, although
  example/default setup may be provided by project or vendor.


How Applications Use DPM

* Application power states are mapped to operating points by the policy.

* Example of a useful DPM app: a media player adjusts its
  power/performance state according to current real-time deadlines
  (set power down while drawing the next frame, bump up power if
  approaching the deadline for the next frame).
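
The media player case can be sketched as a small decision function.
This is only an illustration under stated assumptions: choose_state and
its thresholds are hypothetical, the state names are the ones listed in
the demo policy, and it assumes the active policy maps "task+" states to
faster operating points and "task-" states to slower ones.  How the
chosen state is handed to DPM is not shown.

```shell
#!/bin/sh
# Hypothetical helper: pick a DPM task state name from the time left
# before the next frame deadline.  Thresholds are made-up examples.
choose_state() {
    slack_ms=$1      # milliseconds until the next frame must be shown
    frame_ms=$2      # frame period in milliseconds
    if [ "$slack_ms" -lt $((frame_ms / 4)) ]; then
        echo "task+1"    # close to the deadline: ask for more performance
    elif [ "$slack_ms" -gt $((frame_ms / 2)) ]; then
        echo "task-1"    # well ahead of schedule: a lower state will do
    else
        echo "task"      # on schedule: stay at the default task state
    fi
}

choose_state 5 40     # 5 ms left of a 40 ms frame: prints task+1
choose_state 30 40    # 30 ms left: prints task-1
```

Keeping the decision in the application and the state-to-operating-point
mapping in the policy means the same player can run unchanged under a
different policy.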


Device Power Management Interactions

* For example, some platforms cannot enter certain operating points when
  certain devices are powered/in use -- implemented by DPM "device
  constraints".

* DSP power needs in dual-core application and DSP/baseband processor
  systems may be handled similarly: Linux DSP driver constrains power
  state changes that would adversely affect voiceband processing.

* Choose compatible operating point -- implemented by DPM operating
  point "classes".

* Community also discussing ideas for policy-based device power state
  transitions, particularly for complicated case of USB host and
  device controllers and device tree.
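
The idea of a device constraint limiting which operating points may be
entered can be modeled in a few lines.  This is a sketch of the concept
only, not the DPM constraint or class interface: op_allowed and
first_compatible are hypothetical names, and a constraint is assumed to
be a simple min/max range on one power parameter (CPU MHz here).

```shell
#!/bin/sh
# Hypothetical model: a constraint is a "min max" range for one power
# parameter.  Succeeds if the value lies inside the range.
op_allowed() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Print the first candidate value that satisfies the constraint, or
# fail if no candidate operating point is compatible.
first_compatible() {
    min=$1; max=$2; shift 2
    for mhz in "$@"; do
        if op_allowed "$mhz" "$min" "$max"; then
            echo "$mhz"
            return 0
        fi
    done
    return 1
}

# A device that needs at least 1000 MHz rules out the 600 MHz point.
first_compatible 1000 1400 600 1000 1400    # prints 1000
```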


System State Interactions

* DPM provides APIs for kernel-to-userspace event notification and
  app-to-power-manager messages.

* Example of DPM interfacing with system suspend: device constraints
  reject an attempt to sleep when a device has work to do.


Power Policy Management

* Software may be tied into platform-specific performance/profiling
  registers for automating policy decisions.

* Have previously demo'ed DPM with ARM's Intelligent Energy Management,
  which is being demo'ed separately today.

* Intel provides IPM for Intel XScale PXA27x on top of DPM.

* MontaVista is adding a power policy framework to its products.  It
  provides infrastructure for developer convenience, automates common
  policy management chores, and supports plugins for customization.

* Improved power policy management a major selling point for hardware
  and OS vendors.


Relationships to Other PM Tech

* Policy could determine how aggressively to gate clocks.

* Sony also demo'ing safe suspend and fast and clean shutdown.  These
  are aimed at static power management; general system PM concerns
  apply, but not necessarily DPM.


Common Design Challenges

* Crafting a good power management solution is not easy.  Can relate
  some common sources of design problems or general frustration from our
  experience working with CE companies on PM strategies.

* No immediate error return on DPM device constraint rejects, etc.

* Current DPM power state is transitory; it may change at the next
  context switch.  The obvious operation of "set the power state I want"
  is usually not sufficient to achieve goals -- you must set the policy
  you want.

* Placing intelligence in kernel is often initially attractive to
  designers, but kernel may lack knowledge of higher-level policy
  ("battery is low, must shut down despite driver's normal
  reservations"), and this model is generally not favored by Linux
  maintainers.

* Device PM features are ACPI-centric (and somewhat PCI-centric),
  developed with general-purpose desktop/server usage in mind.

* Differing interpretations or assumptions related to PM terminology
  (what does "idle" or "sleep" or "warm shutdown" mean?).

* A vast number of options available for architecting a PM solution.
  How to accomplish goals using the relatively low-level tools offered
  by core DPM interfaces?  Do we need to temporarily set a different
  state, or modify a task's state, or switch policy, or switch to a new
  "profile" of several policies based on system state...?

* Hardware complexity: Power transitions with long latencies, especially
  device clocking interactions, may not be appropriate for non-standby
  power states.


Potential Future Directions


Want to Get Involved?

* Kernel PM discussions at OSDL "unadvertised" list linux-pm and also
  LKML.

* The evaluation/reference boards available to OS vendors are usually
  not suitable for low power evaluation.  Silicon and board vendors may
  want to take low power evaluation into account -- test probe points,
  avoid features that unavoidably leak power...

Demonstrations

The Intel Centrino notebook used for the presentation runs 2.6.10 and
DPM.  Snippets of the DPM setup scripts are shown below.

**********
# Create operating points: name, cpu (MHz), v (apparently mV)
echo create 600M  600  956 > /sys/dpm/op/control
echo create 1G   1000 1228 > /sys/dpm/op/control
echo create 1p4G 1400 1484 > /sys/dpm/op/control

#     The states for the policies are:
#               "idle-task", "idle",
#               "task-4", "task-3", "task-2", "task-1",
#               "task",
#               "task+1", "task+2", "task+3", "task+4"

# LOW-HIGH: 600M idle, 1G task-, and 1p4G task and task+

echo create LOW-HIGH \
	600M    600M                   \
	1G      1G      1G      1G     \
	1p4G                           \
	1p4G    1p4G    1p4G    1p4G   > /sys/dpm/policy/control 

echo LOW-HIGH > /sys/dpm/policy/active

**********
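
The policy line above repeats each operating point once per state.  A
small helper can build it from four choices.  This is a convenience
sketch, not part of DPM: make_policy_cmd is a hypothetical name, and the
positional state order is assumed from the comment in the setup script
(idle-task, idle, task-4 to task-1, task, task+1 to task+4).

```shell
#!/bin/sh
# Build a "create" line for /sys/dpm/policy/control from four choices.
# Assumed state order, taken from the demo script's comment:
#   idle-task idle | task-4 .. task-1 | task | task+1 .. task+4
make_policy_cmd() {
    name=$1; idle_op=$2; minus_op=$3; task_op=$4; plus_op=$5
    echo "create $name" \
        "$idle_op $idle_op" \
        "$minus_op $minus_op $minus_op $minus_op" \
        "$task_op" \
        "$plus_op $plus_op $plus_op $plus_op"
}

# Same mapping as the LOW-HIGH policy in the demo script.
make_policy_cmd LOW-HIGH 600M 1G 1p4G 1p4G
```

On a DPM kernel the output would be redirected to
/sys/dpm/policy/control, as in the setup script.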

sysfs values from the running system dumped:

# Active policy
$ cat /sys/dpm/policy/active
LOW-HIGH

# Operating point for idle operating state under that policy
$ cat /sys/dpm/policy/LOW-HIGH/idle
600M

# Operating point for task operating state under that policy
$ cat /sys/dpm/policy/LOW-HIGH/task
1p4G

# Power parameters for 600M operating point
$ cat /sys/dpm/op/600M/cpu /sys/dpm/op/600M/v
600
956
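
The per-state reads above can be wrapped in a loop.  This is a sketch
under assumptions: dump_policy and the DPM_ROOT override are
hypothetical, and only the idle and task files are confirmed by the dump
above.  The other state files are assumed to follow the same naming and
are skipped if they are absent.

```shell
#!/bin/sh
# Print "state operating-point" for each state of a policy.
# DPM_ROOT is a hypothetical override so the function can be tried
# against a scratch directory; the demo system uses /sys/dpm.
dump_policy() {
    policy=$1
    root=${DPM_ROOT:-/sys/dpm}
    for state in idle-task idle task-4 task-3 task-2 task-1 \
                 task task+1 task+2 task+3 task+4; do
        f="$root/policy/$policy/$state"
        # Skip any state file that does not exist on this system.
        [ -r "$f" ] && echo "$state $(cat "$f")"
    done
    return 0
}
```

On the demo notebook, "dump_policy LOW-HIGH" would be expected to list
at least the idle and task lines shown above.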

